Chapter 2: Discrete Random Variables


4- Expectation/ Mean, Variance, Standard deviation, Moment 


Expectation 


We define the expectation (also called the expected value or the mean) of a random variable X with PMF $p_X$, denoted E[X], by

$$E[X] = \sum_{x \in E} x \, p_X(x)$$


▪ Example:


Consider two independent coin tosses, each with a 3/4 probability of a head, and let X be the number of 
heads obtained. What is its PMF? What is its expectation? 
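As a sketch of how the example above can be checked numerically, the snippet below builds the PMF of X from the binomial formula and then applies the definition of E[X]. The helper names `binomial_pmf` and `expectation` are illustrative and not part of the course material; exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return {k: P(X = k)} for the number of heads in n independent tosses."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def expectation(pmf):
    """E[X] = sum over x of x * p_X(x)."""
    return sum(x * px for x, px in pmf.items())


# Two independent tosses, each with probability 3/4 of a head.
pmf = binomial_pmf(2, Fraction(3, 4))
mean = expectation(pmf)

for k, pk in pmf.items():
    print(f"p_X({k}) = {pk}")
print("E[X] =", mean)
```

Under these assumptions the PMF is 1/16, 3/8, 9/16 for 0, 1, 2 heads, and the mean is 3/2.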


Important remark: 


When dealing with random variables that take a countably infinite number of values, one has to deal with the possibility that the infinite sum $\sum_{x \in E} x \, p_X(x)$ is not well-defined. More concretely, we will say that the expectation is well-defined if $\sum_{x \in E} |x| \, p_X(x) < \infty$.

For example: what is the mean of the random variable X that takes the value $2^k$ with probability $2^{-k}$, for $k = 1, 2, \ldots$?
It is useful to view the mean/expectation of X as a "representative" value of X, which lies somewhere in the middle of its range. We can make this statement more precise by viewing the mean as the center of gravity of the random variable.


▪ Example:
[Figure: the PMF $p_X(x)$ plotted against x, indicating the expectation of X.]
• Variance

The variance of a random variable X, denoted Var(X), is defined as the mean of the random variable $(X - E[X])^2$, i.e.,

$$\mathrm{Var}(X) = E\big[(X - E[X])^2\big]$$

The variance provides a measure of the dispersion of the random variable around its mean.


▪ Question:

▪ Can the variance be positive?

▪ Can the mean be positive?

▪ When is the variance equal to zero?
• Standard Deviation

The standard deviation of X is defined as the square root of its variance, and is denoted $\sigma_X$:

$$\sigma_X = \sqrt{\mathrm{Var}(X)}$$

The standard deviation is another measure of the dispersion of the random variable around its mean. It is often easier to interpret than the variance, since the standard deviation has the same units as X.


▪ Question:

▪ Can the standard deviation be positive?


Exercise:
Consider the random variable X with PMF

$$p_X(x) = \begin{cases} 1/9 & \text{if } x = -4, -3, -2, -1, 0, 1, 2, 3, 4 \\ 0 & \text{otherwise.} \end{cases}$$

What are the expectation, variance, and standard deviation of X?
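The definitions of the mean, variance, and standard deviation translate directly into a few lines of code. The sketch below assumes the PMF in the exercise is uniform over the nine integers from -4 to 4 (probability 1/9 each); the helper `expectation` is an illustrative name, not something defined in the course.

```python
from fractions import Fraction
from math import sqrt


def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * p_X(x); g defaults to the identity."""
    return sum(g(x) * px for x, px in pmf.items())


# Assumed PMF: probability 1/9 on each integer in [-4, 4].
pmf = {x: Fraction(1, 9) for x in range(-4, 5)}

mean = expectation(pmf)
variance = expectation(pmf, lambda x: (x - mean) ** 2)
std_dev = sqrt(variance)  # floating-point square root of the exact variance

print("E[X]    =", mean)
print("Var(X)  =", variance)
print("sigma_X =", round(std_dev, 4))
```

With this PMF the mean is 0 by symmetry, and the variance is (2/9)(1 + 4 + 9 + 16) = 20/3.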
• Moment of order n:

• The moment of order n is given by the expectation of $X^n$: $E[X^n]$

• The central moment of order n is given by the expectation of $(X - E[X])^n$: $E[(X - E[X])^n]$

▪ Remark:

▪ The expectation of X is called the moment of first order.
▪ The variance of X is also called the central moment of second order.
• Expectation for functions of random variables:

Let X be a random variable with PMF $p_X$, and let g(X) be a function of X. Then, the expected value of the random variable g(X) is given by

$$E[g(X)] = \sum_{x \in E} g(x) \, p_X(x)$$
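Exercise:

Using the formula above, we can compute the variance and the moment of order n differently:

$$\mathrm{Var}(X) = E\big[(X - E[X])^2\big] = \; ?$$
$$E[X^n] = \; ?$$

Using this formula, what is the variance of the random variable X with PMF

$$p_X(x) = \begin{cases} 1/9 & \text{if } x = -4, -3, -2, -1, 0, 1, 2, 3, 4 \\ 0 & \text{otherwise?} \end{cases}$$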
• Properties of the mean and variance:

Let X be a random variable with mean E[X] and variance Var(X). Let $a \in \mathbb{R}$ and $b \in \mathbb{R}$. The random variable aX + b has the following mean and variance:

$$E[aX + b] = aE[X] + b$$

and

$$\mathrm{Var}(aX + b) = a^2 \, \mathrm{Var}(X)$$

▪ Proof:
• A convenient formula for the variance:

The variance of a random variable X can also be calculated with the following convenient alternative formula:

$$\mathrm{Var}(X) = E[X^2] - (E[X])^2$$

▪ Proof:
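One possible derivation, sketched here using only the definition of the variance, the formula for the expectation of a function of X, and the fact that a PMF sums to 1:

```latex
\begin{aligned}
\mathrm{Var}(X)
  &= E\big[(X - E[X])^2\big]
   = \sum_{x \in E} (x - E[X])^2 \, p_X(x) \\
  &= \sum_{x \in E} x^2 \, p_X(x)
     - 2E[X] \sum_{x \in E} x \, p_X(x)
     + (E[X])^2 \sum_{x \in E} p_X(x) \\
  &= E[X^2] - 2(E[X])^2 + (E[X])^2 \\
  &= E[X^2] - (E[X])^2 .
\end{aligned}
```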
▪ Important remark:
Unless g is a linear function, it is not generally true that E[g(X)] is equal to g(E[X]).
▪ Exercise:

If the weather is good (which happens with probability 0.6), Ayman walks the 2 km to class at a speed of V = 10 km per hour; otherwise, he rides his motorcycle at a speed of V = 60 km per hour. What is the mean of the time T to get to class?
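This exercise illustrates why E[g(V)] generally differs from g(E[V]) when g is not linear. The sketch below computes E[T] with T = 2/V from the PMF of V, and compares it with the shortcut 2/E[V]. The variable names are illustrative.

```python
from fractions import Fraction

distance_km = 2
# PMF of the speed V in km/h: good weather -> walk, otherwise motorcycle.
speed_pmf = {10: Fraction(6, 10), 60: Fraction(4, 10)}

# Correct approach: E[T] = sum over v of (distance / v) * p_V(v).
mean_time_h = sum(p * Fraction(distance_km, v) for v, p in speed_pmf.items())

# Tempting but different quantity: distance / E[V].
mean_speed = sum(p * v for v, p in speed_pmf.items())
naive_time_h = Fraction(distance_km) / mean_speed

print("E[T]       =", mean_time_h, "hours =", mean_time_h * 60, "minutes")
print("2 / E[V]   =", naive_time_h, "hours =", naive_time_h * 60, "minutes")
```

Here E[T] is 2/15 of an hour (8 minutes), whereas 2/E[V] is 1/15 of an hour (4 minutes), so the two quantities are not interchangeable.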
▪ Exercise:

Consider a quiz game where a person is given two questions and must decide which one to answer first. Question A will be answered correctly with probability 0.8, and the person will then receive 1000 Dhs, while Question B will be answered correctly with probability 0.5, and the person will then receive 2000 Dhs as a prize. If the first question attempted is answered incorrectly, the quiz terminates; that is, the person is not allowed to attempt the second question. If the first question is answered correctly, the person is allowed to attempt the second question. Which question should be answered first to maximize the expected value of the total prize received?
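A small sketch of how the two orderings can be compared, assuming the quantity to maximize is the total prize money and that the two answers are independent. The function `expected_prize` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction


def expected_prize(first, second):
    """Expected total prize when `first` is attempted before `second`.

    Each question is a (probability_correct, prize) pair. The second prize
    can only be won if the first question is answered correctly.
    """
    p1, prize1 = first
    p2, prize2 = second
    return p1 * prize1 + p1 * p2 * prize2


question_a = (Fraction(8, 10), 1000)
question_b = (Fraction(1, 2), 2000)

a_first = expected_prize(question_a, question_b)
b_first = expected_prize(question_b, question_a)

print("A first:", a_first, "Dhs")
print("B first:", b_first, "Dhs")
```

Under these assumptions, attempting Question A first gives an expected prize of 1600 Dhs, against 1400 Dhs for Question B first.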


Probability & Statistics -Pr. Hajar El Hammouti - Computer Science School 


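• Mean and Variance of Some Common Random Variables:

| Random Variable | Parameter | PMF | Mean/Expectation | Variance |
|---|---|---|---|---|
| Bernoulli | $p$ | $p_X(x) = p$ if $x = 1$, $1 - p$ if $x = 0$ | $p$ | $p(1 - p)$ |
| Binomial | $p, n$ | $p_X(k) = \binom{n}{k} p^k (1 - p)^{n-k}$, $k = 0, 1, \ldots, n$ | $np$ | $np(1 - p)$ |
| Geometric | $p$ | $p_X(k) = (1 - p)^{k-1} p$, $k = 1, 2, \ldots$ | $\frac{1}{p}$ | $\frac{1 - p}{p^2}$ |
| Poisson | $\lambda$ | $p_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}$, $k = 0, 1, 2, \ldots$ | $\lambda$ | $\lambda$ |
| Uniform | | | | |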
5- Joint Probability Mass Function

• Joint Probability Mass Function:

Consider two discrete random variables X and Y. The probabilities of the values that X and Y can take are captured by the joint probability mass function of X and Y, denoted

$$p_{X,Y}(x, y) = P(X = x, Y = y) = P(\{X = x\} \cap \{Y = y\})$$

If A is the set of all pairs (x, y) that have a certain property, then

$$P((X, Y) \in A) = \sum_{(x, y) \in A} p_{X,Y}(x, y)$$
• Marginal Probability Mass Function:

Consider two discrete random variables X and Y. The marginal probability mass functions of X and Y can be obtained by summing the joint PMF over all values of y and x, respectively:

$$p_X(x) = \sum_y p_{X,Y}(x, y) \quad \text{and} \quad p_Y(y) = \sum_x p_{X,Y}(x, y)$$


▪ Example (Illustration of joint PMF using a table):
Each row corresponds to a value of y and each column to a value of x. The sum of a row gives $p_Y(y)$, and the sum of a column gives $p_X(x)$.

    0       1/20    1/20    1/20
    1/20    2/20    3/20    1/20
    1/20    2/20    3/20    1/20
    1/20    1/20    1/20    0

What are $p_X(2)$ and $p_Y(2)$?
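The row and column sums in the table can be computed mechanically. The sketch below stores the sixteen table entries (in twentieths) and sums them to obtain both marginals. The axis labels are an assumption made for this illustration: x = 1, 2, 3, 4 from left to right and y = 4, 3, 2, 1 from top to bottom.

```python
from fractions import Fraction

# Table entries in units of 1/20, rows listed from top to bottom.
table_20ths = [
    [0, 1, 1, 1],
    [1, 2, 3, 1],
    [1, 2, 3, 1],
    [1, 1, 1, 0],
]
# ASSUMED axis labels (not recoverable from the slide itself).
x_values = [1, 2, 3, 4]
y_values = [4, 3, 2, 1]

joint = {
    (x, y): Fraction(n, 20)
    for y, row in zip(y_values, table_20ths)
    for x, n in zip(x_values, row)
}

# Marginals: sum the joint PMF over the other variable.
p_x = {x: sum(joint[(x, y)] for y in y_values) for x in x_values}
p_y = {y: sum(joint[(x, y)] for x in x_values) for y in y_values}

print("p_X(2) =", p_x[2])
print("p_Y(2) =", p_y[2])
```

With this labeling, the column for x = 2 sums to 6/20 and the row for y = 2 sums to 7/20.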
• Functions of Multiple Random Variables:

Let g be a mapping from E×E to E, and let X and Y be two random variables. The random variable Z = g(X, Y) has the following PMF:

$$p_Z(z) = \sum_{\{(x, y) \,\mid\, g(x, y) = z\}} p_{X,Y}(x, y)$$

The expected value of the function g of the two random variables extends as follows:

$$E[g(X, Y)] = \sum_{(x, y) \in E \times E} g(x, y) \, p_{X,Y}(x, y)$$

In the special case where the function g is linear, of the form g(x, y) = ax + by + c with a, b, c some scalars, we have

$$E[aX + bY + c] = aE[X] + bE[Y] + c$$
• More than Two Random Variables:

• Consider three discrete random variables X, Y and Z. The probabilities of the values that X, Y and Z can take are captured by the joint probability mass function of X, Y and Z, denoted

$$p_{X,Y,Z}(x, y, z) = P(X = x, Y = y, Z = z) = P(\{X = x\} \cap \{Y = y\} \cap \{Z = z\})$$

• The corresponding marginal PMFs can be written

$$p_{X,Y}(x, y) = \sum_z p_{X,Y,Z}(x, y, z), \quad p_Y(y) = \sum_x \sum_z p_{X,Y,Z}(x, y, z), \quad p_X(x) = \sum_y \sum_z p_{X,Y,Z}(x, y, z)$$

• Let g be a mapping from E×E×E to E. The corresponding expectation of g(X, Y, Z) can be calculated as

$$E[g(X, Y, Z)] = \sum_{(x, y, z) \in E \times E \times E} g(x, y, z) \, p_{X,Y,Z}(x, y, z)$$
▪ Exercise:

A class has 300 students, and each student has probability 1/3 of getting an A, independently of any other student. What is the mean of X, X being the number of students that get an A?

▪ Exercise:

Suppose that n people throw their hats in a box and then each picks one hat at random. (Each hat can be picked by only one person, and each assignment of hats to persons is equally likely.) What is the expected value of X, X being the number of people that get back their own hat?
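Both exercises can be approached by writing X as a sum of indicator variables and using linearity of expectation. The sketch below applies this to the class exercise and, for the hat problem, checks the result by brute-force enumeration of all assignments for small n. The function `expected_matches` is a hypothetical helper, and the enumeration is only practical for small n.

```python
from fractions import Fraction
from itertools import permutations

# Class exercise: X = X_1 + ... + X_300 with E[X_i] = 1/3, so E[X] = 300 * 1/3.
expected_number_of_as = 300 * Fraction(1, 3)


def expected_matches(n):
    """Average number of people who get their own hat, over all n! assignments."""
    perms = list(permutations(range(n)))
    total_matches = sum(
        sum(1 for person, hat in enumerate(perm) if person == hat)
        for perm in perms
    )
    return Fraction(total_matches, len(perms))


print("E[number of A grades] =", expected_number_of_as)
for n in range(1, 7):
    print(f"n = {n}: E[X] = {expected_matches(n)}")
```

The enumeration agrees with the linearity argument: each person gets their own hat with probability 1/n, so E[X] = n · (1/n) = 1 for every n, even though the indicator variables are not independent.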
• Conditioning a random variable on an event:
The conditional PMF of a random variable X, conditioned on a particular event A with P(A) > 0, is defined by

$$p_{X|A}(x) = \frac{P(\{X = x\} \cap A)}{P(A)}$$

where $\sum_x P(\{X = x\} \cap A) = P(A)$.
Note that the events $\{X = x\} \cap A$ are disjoint for different values of x.

▪ Example:
Let X be the roll of a fair six-sided die and let A be the event that the roll is an even number. Then, by applying the preceding formula, we obtain the PMF of X|A,

$$p_{X|A}(k) = \frac{P(\{X = k\} \cap A)}{P(A)}, \quad k = 1, 2, \ldots, 6$$

Then the PMF of X|A is

$$p_{X|A}(k) = \begin{cases} 1/3 & \text{if } k = 2, 4, 6 \\ 0 & \text{otherwise} \end{cases}$$
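The die example can be reproduced with a short routine that restricts a PMF to an event and renormalizes by P(A). This is a minimal sketch; `condition_on_event` is an illustrative name, and the event is represented simply as a set of values of X.

```python
from fractions import Fraction


def condition_on_event(pmf, event):
    """Return p_{X|A}(x) = P({X = x} and A) / P(A) for an event A given as a set of values."""
    p_event = sum(p for x, p in pmf.items() if x in event)
    if p_event == 0:
        raise ValueError("conditioning event must have positive probability")
    return {x: (p / p_event if x in event else Fraction(0)) for x, p in pmf.items()}


die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}
even_roll = {2, 4, 6}

conditional_pmf = condition_on_event(die_pmf, even_roll)
for k, pk in conditional_pmf.items():
    print(f"p_X|A({k}) = {pk}")
```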
6- Conditioning

• Conditioning a random variable on another:

Let X and Y be two random variables associated with the same experiment. If we know that the value of Y is some particular y with $p_Y(y) > 0$, this provides partial knowledge about the value of X. This knowledge is captured by the conditional PMF $p_{X|Y}$ of X given Y, which is defined for $(x, y) \in E \times E$ by

$$p_{X|Y}(x|y) = P(X = x \mid Y = y)$$

Using the definition of conditional probabilities, we have

$$p_{X|Y}(x|y) = \frac{p_{X,Y}(x, y)}{p_Y(y)}$$

Let us fix some y with $p_Y(y) > 0$, and consider $p_{X|Y}(x|y)$ as a function of x. This function is a valid PMF of X: it assigns nonnegative values to each possible x, and these values add to 1, hence

$$\sum_x p_{X|Y}(x|y) = 1$$
▪ Exercise:


Consider a transmitter that is sending messages over a network. Let us define the following two random variables:
• X: the travel time of a given message,
• Y: the length of the given message.

We know the PMF of the travel time of a message that has a given length, and we know the PMF of the message length. We want to find the (unconditional) PMF of the travel time of a message. We assume that the length of a message can take two possible values: y = 100 with probability 5/6 and y = 10000 with probability 1/6.

We assume that the travel time X depends on the length Y. In particular, the travel time is $10^{-4}Y$ seconds with probability 1/2, $10^{-3}Y$ seconds with probability 1/3, and $10^{-2}Y$ seconds with probability 1/6. What is the PMF of X?
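The unconditional PMF follows from $p_X(x) = \sum_y p_Y(y) \, p_{X|Y}(x|y)$. The sketch below carries out that sum, assuming the three travel-time multipliers are $10^{-4}$, $10^{-3}$ and $10^{-2}$ with probabilities 1/2, 1/3 and 1/6 (the first two exponents and the first probability are read from a partly illegible slide, so treat them as assumptions).

```python
from collections import defaultdict
from fractions import Fraction

# PMF of the message length Y.
length_pmf = {100: Fraction(5, 6), 10_000: Fraction(1, 6)}

# ASSUMED conditional law: X = factor * Y, with these factor probabilities.
factor_pmf = {
    Fraction(1, 10**4): Fraction(1, 2),
    Fraction(1, 10**3): Fraction(1, 3),
    Fraction(1, 10**2): Fraction(1, 6),
}

# Total probability: accumulate p_Y(y) * p_{X|Y}(x|y) for every reachable x.
travel_pmf = defaultdict(Fraction)
for y, p_y in length_pmf.items():
    for factor, p_given_y in factor_pmf.items():
        travel_pmf[factor * y] += p_y * p_given_y

for x in sorted(travel_pmf):
    print(f"p_X({float(x)}) = {travel_pmf[x]}")
```

Note that the value x = 1 second is reachable from both message lengths, so its probability is the sum of two terms.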
• Conditional Expectation:
A conditional expectation is the same as an ordinary expectation, except that it refers to the new universe, and all probabilities and PMFs are replaced by their conditional counterparts.

Let X and Y be random variables associated with the same experiment.

▪ The conditional expectation of X given an event A with P(A) > 0 is defined by

$$E[X \mid A] = \sum_x x \, p_{X|A}(x)$$

▪ For a function g(X), we have

$$E[g(X) \mid A] = \sum_x g(x) \, p_{X|A}(x)$$

▪ The conditional expectation of X given a value y of Y is defined by

$$E[X \mid Y = y] = \sum_x x \, p_{X|Y}(x|y)$$
The following three equalities apply in different situations, but are essentially equivalent, and will be referred to as the total expectation theorem.

Let X and Y be random variables associated with the same experiment.
• If $A_1, \ldots, A_n$ are disjoint events that form a partition of the sample space, with $P(A_i) > 0$ for all i, then

$$E[X] = \sum_{i=1}^{n} P(A_i) \, E[X \mid A_i]$$

Furthermore, for any event B with $P(A_i \cap B) > 0$ for all i, we have

$$E[X \mid B] = \sum_{i=1}^{n} P(A_i \mid B) \, E[X \mid A_i \cap B]$$

• We also have

$$E[X] = \sum_y p_Y(y) \, E[X \mid Y = y]$$


Let us verify this equality:

If $A_1, \ldots, A_n$ are disjoint events that form a partition of the sample space, with $P(A_i) > 0$ for all i, then

$$E[X] = \sum_{i=1}^{n} P(A_i) \, E[X \mid A_i]$$

For the proof, use the total probability theorem, which can be expressed as

$$p_X(x) = \sum_{i=1}^{n} P(A_i) \, p_{X|A_i}(x)$$
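One way to carry out this verification is to substitute the total probability theorem into the definition of E[X] and exchange the order of the two sums:

```latex
\begin{aligned}
E[X]
  &= \sum_{x} x \, p_X(x)
   = \sum_{x} x \sum_{i=1}^{n} P(A_i) \, p_{X|A_i}(x) \\
  &= \sum_{i=1}^{n} P(A_i) \sum_{x} x \, p_{X|A_i}(x) \\
  &= \sum_{i=1}^{n} P(A_i) \, E[X \mid A_i] .
\end{aligned}
```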
▪ Exercise:

Messages transmitted by a computer in Boston through a data network are destined for New York with probability 0.5, for Chicago with probability 0.3, and for San Francisco with probability 0.2. The transit time X of a message is random. Its mean is 0.05 seconds if it is destined for New York, 0.1 seconds if it is destined for Chicago, and 0.3 seconds if it is destined for San Francisco. What is the mean of the transit time?
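This exercise is a direct application of the total expectation theorem, with the three destinations forming the partition. A minimal sketch, with illustrative variable names and exact fractions for the given decimals:

```python
from fractions import Fraction

# destination -> (probability of destination, conditional mean transit time in seconds)
destinations = {
    "New York": (Fraction(5, 10), Fraction(5, 100)),
    "Chicago": (Fraction(3, 10), Fraction(1, 10)),
    "San Francisco": (Fraction(2, 10), Fraction(3, 10)),
}

# E[X] = sum over i of P(A_i) * E[X | A_i]
mean_transit = sum(p * cond_mean for p, cond_mean in destinations.values())

print("E[X] =", mean_transit, "seconds =", float(mean_transit))
```

The three terms are 0.025, 0.03 and 0.06, giving a mean transit time of 0.115 seconds.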
• Independence:


We now discuss concepts of independence related to random variables. These are analogous to the 
concepts of independence between events. They are developed by simply introducing suitable events 
involving the possible values of various random variables, and by considering the independence of these 
events. 


Independence of a Random Variable from an Event:
The independence of a random variable from an event is similar to the independence of two events. The idea is that knowing the occurrence of the conditioning event provides no new information on the value of the random variable. More formally, we say that the random variable X is independent of the event A if, for all x,

$$P(\{X = x\} \text{ and } A) = P(A) \, p_X(x)$$

Moreover, from the definition of the conditional probability, we have

$$P(\{X = x\} \text{ and } A) = P(A) \, p_{X|A}(x)$$

so that, as long as P(A) > 0, a random variable X is independent of an event A if and only if, for all x,

$$p_X(x) = p_{X|A}(x)$$
• Independence of two Random Variables:

We say that two random variables X and Y are independent if for all x and y

$$p_{X,Y}(x, y) = p_X(x) \, p_Y(y)$$

Proposition

If X and Y are two independent random variables, then

$$E[XY] = E[X] \, E[Y]$$

and

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$

▪ Proof:
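A possible line of argument, sketched using the factorization of the joint PMF under independence, the linearity of expectation, and the formula Var(X) = E[X²] − (E[X])²:

```latex
\begin{aligned}
E[XY]
  &= \sum_{x} \sum_{y} x \, y \, p_{X,Y}(x, y)
   = \sum_{x} \sum_{y} x \, y \, p_X(x) \, p_Y(y) \\
  &= \Big( \sum_{x} x \, p_X(x) \Big) \Big( \sum_{y} y \, p_Y(y) \Big)
   = E[X] \, E[Y], \\[6pt]
\mathrm{Var}(X + Y)
  &= E\big[(X + Y)^2\big] - \big(E[X] + E[Y]\big)^2 \\
  &= E[X^2] + 2E[XY] + E[Y^2] - (E[X])^2 - 2E[X]E[Y] - (E[Y])^2 \\
  &= \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\big(E[XY] - E[X]E[Y]\big) \\
  &= \mathrm{Var}(X) + \mathrm{Var}(Y).
\end{aligned}
```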
▪ Example (Binomial as a sum of independent Bernoulli):

We consider n independent coin tosses, with each toss having probability p of coming up a head. For each i, we let $X_i$ be the Bernoulli random variable which is equal to 1 if the i-th toss comes up a head, and is 0 otherwise. Then, $X = X_1 + \cdots + X_n$ is a binomial random variable. Its mean is E[X] = np, as derived before. By the independence of the coin tosses, the random variables $X_1, \ldots, X_n$ are independent; thus, the variance of the binomial can easily be computed, using the formula from the previous slide, as

$$\mathrm{Var}(X) = \sum_{i=1}^{n} \mathrm{Var}(X_i) = np(1 - p)$$


▪ Exercise:

Alice passes through four traffic lights on her way to work, and each light is equally likely to be green or red, independently of the others.

(a) What are the PMF, the mean, and the variance of the number of red lights that Alice encounters?

(b) Suppose that each red light delays Alice by exactly two minutes. What is the variance of Alice's delay time?
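A short numerical sketch of this exercise, treating the number of red lights as a binomial random variable with n = 4 and p = 1/2, and the delay as 2X minutes so that Var(2X) = 2² Var(X). Variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n, p = 4, Fraction(1, 2)

# (a) PMF, mean and variance of the number of red lights X.
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}
mean = sum(k * pk for k, pk in pmf.items())
variance = sum((k - mean) ** 2 * pk for k, pk in pmf.items())

# (b) Delay D = 2 X minutes, so Var(D) = 2^2 * Var(X).
minutes_per_red_light = 2
delay_variance = minutes_per_red_light**2 * variance

print("PMF:", {k: str(pk) for k, pk in pmf.items()})
print("E[X] =", mean, " Var(X) =", variance)
print("Var(delay) =", delay_variance, "minutes^2")
```

The computed values agree with the binomial formulas np = 2 and np(1 − p) = 1.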
• Summary of the main concepts + formulas of this course:
▪ Probability Mass Function (PMF): $p_X(x) = P(\{X = x\})$ such that $\sum_{x \in E} p_X(x) = 1$

▪ Well-known PMFs: Bernoulli, binomial, geometric, Poisson, uniform + building the histogram of a random variable
▪ Expectation, variance, standard deviation and moments:
$E[X] = \sum_{x \in E} x \, p_X(x)$ (center of gravity), $\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$ (dispersion), $\sigma_X = \sqrt{\mathrm{Var}(X)}$
▪ Expectation of a function of a random variable:

$$E[g(X)] = \sum_{x \in E} g(x) \, p_X(x)$$

▪ Joint and marginal PMF:
$p_{X,Y}(x, y) = P(X = x, Y = y) = P(\{X = x\} \cap \{Y = y\})$, $p_X(x) = \sum_y p_{X,Y}(x, y)$ and $p_Y(y) = \sum_x p_{X,Y}(x, y)$
▪ Conditional expectation: $E[X \mid Y = y] = \sum_x x \, p_{X|Y}(x|y)$
▪ Independence: if X and Y are independent, then $E[XY] = E[X]E[Y]$ and $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$